Preface

Finding  an  objective  measure  of  speech  intelligibility 
has  long  been  a  goal  of  the  speech  communications  analyst. 
Several  automated  methods  have  been  developed,  but  none  have 
performed  as  an  overall  speech  intelligibility  measure  for 
widespread  application  to  speech  communications  systems. 
Subjective  listener  testing  remains  the  most  reliable  method 
of  measuring  speech  intelligibility. 

With  the  appearance  of  linear  predictive  coding  in 
communications  theory  and  in  the  field  of  speech  synthesis, 
it is believed that this method could be applied to speech
intelligibility  measurements.  This  study  examines  the  use 
of  linear  predictive  coding  for  objective  intelligibility 
scoring  and  develops  a  metric  for  that  purpose. 

I am deeply indebted to Mrs. Alisa Workman of the
Biological  Acoustics  Branch  of  the  Air  Force  Aerospace 
Medical  Research  Laboratory  for  her  hours  of  assistance  in 
the  preparation  of  the  voice  data  tape  and  the  subjective 
listener testing. I wish to thank Mr. Richard McKinley for
the  use  of  the  acoustics  laboratory  and  its  equipment.  I 
also  wish  to  thank  Captain  Larry  Kizer,  my  advisor,  for  his 
guidance,  assistance,  and  encouragement  during  this  study; 
and  thanks  also  to  Dr.  Matthew  Kabrisky  and  Major  Ken  Castor 
for their guidance, assistance, and review of this report.

Abstract

A scoring metric of speech intelligibility based on
linear predictive coding (LPC) was developed and evaluated.
The  data  base  used  for  evaluating  the  metric  consisted  of  a 
list of 50 words from the Modified Rhyme Test. The list was
transmitted over an LPC-10 Vocoder with no background noise.
The  list  was  scored  subjectively  for  intelligibility  by  a 
trained  listener  panel.  The  subjective  scores  were  used  to 
judge  the  effectiveness  of  the  objective  metric. 

The  LPC  scoring  metric  was  calculated  for  the  list  of 
words and compared to the subjective scoring. The intelligibility
score for the objective scoring metric was 82.99%
with a standard deviation of 14.41%. The score for the
subjective listener testing was 84.91% with a standard
deviation of 7.47%. This shows a possible correlation
between the objective LPC scoring metric and standard subjective
listener scoring methods.


THE CORRELATION BETWEEN SUBJECTIVE
AND  OBJECTIVE  MEASURES  OF  CODED 
SPEECH  QUALITY  AND  INTELLIGIBILITY 
FOLLOWING  NOISE  CORRUPTION 

I. Introduction

There  exists  a  need  within  the  military  to  measure  the 
quality  and  intelligibility  of  speech  at  the  output  of  a 
communication system. Many methods exist to measure parameters
such as signal-to-noise ratio and channel noise;
however, there are very few methods available to quickly and
easily measure quality and intelligibility of the speech
output. These qualities are highly subjective, and current
methods of measurement involve many man-hours of testing and
evaluation.  A  quicker  and  more  efficient  method  is  needed 
to  calculate  these  measures.  Situations  where  speech 
intelligibility  and  quality  measurements  are  needed  include 
the evaluation of similar voice communications equipment
and  the  evaluation  of  a  system's  ability  to  withstand 
jamming  or  other  corruption  without  loss  of  the  transmitted 
message.  Intelligible  communications  are  vital  to  the 
military  in  all  areas  of  operations.  The  purpose  of  this 
thesis  is  to  determine  an  objective  measure  of  speech  that 
can  be  used  as  a  metric  of  speech  intelligibility. 

Current  Subjective  Measures 

There are presently two major methods of testing for
intelligibility: subjective listener testing and objective
measurement techniques.
These two titles are very broad in their meaning and, to
understand  speech  intelligibility,  a  more  detailed  knowledge 
is  needed  of  these  methods.  In  a  report  by  Chambers  (Ref 
2),  the  entire  area  of  subjective  listener  testing  was 
reviewed.  This  section  is  a  summary  of  the  main  points 
found  in  that  report. 

Subjective  listener  testing  is  a  method  in  which  a 
talker  reads  some  form  of  a  test  over  a  communications 
system  to  a  panel  of  listeners.  The  listeners  respond  to 
the talker; and by evaluating their responses, a determination
is made as to how intelligibly the communications
system functions. The types of tests used by the talker
vary  depending  on  the  parameter  being  tested.  Five 
categories  of  these  tests  have  evolved:  articulation  tests, 
intelligibility  tests,  speech  comprehension  tests,  speech 
interference  tests,  and  subjective  appraisal  tests. 

Articulation  tests  are  speech  sound  recognition  tests. 
They  use  single  speech  sounds  or  phonemes  which  have  no 
normal  linguistic  distributional  properties  and  carry  no 
meaning. They are not words but just sounds. These sounds
are difficult to understand, and their primary use is in
evaluating two or more relatively good communications systems.
The best known test of this type is the Nonsense
Syllable Test.

Intelligibility  tests  are  speech  perception  tests. 
They  are  used  to  evaluate  the  ability  of  a  communications 
system to correctly convey speech in the form of messages.
They consist of a series of words, short phrases, or sentences
which the listeners hear and attempt to identify.
Intelligibility is scored as the percentage correctly identified.
There are presently twelve major tests of this
type, two of which are the Fairbanks' Rhyme Test and the
Modified Rhyme Test. Table I shows a listing of fifty words
used in one test run of the Modified Rhyme Test. Column 1
would be the actual word spoken by the talker, while the
listeners would select from one of the six in that row.
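
As a minimal sketch of the scoring rule just described, the percentage of correctly identified words could be computed as shown below. The function name `percent_correct` and the four-item word lists are hypothetical illustrations, not software or data from this study.

```python
def percent_correct(spoken, responses):
    """Return the percentage of listener responses that match the spoken word."""
    if len(spoken) != len(responses):
        raise ValueError("spoken and responses must have the same length")
    if not spoken:
        raise ValueError("at least one test item is required")
    correct = sum(1 for word, answer in zip(spoken, responses) if word == answer)
    return 100.0 * correct / len(spoken)


# Hypothetical four-item excerpt; a real Modified Rhyme Test run uses fifty items.
spoken = ["took", "sat", "pin", "bus"]
responses = ["took", "sat", "tin", "bus"]
score = percent_correct(spoken, responses)
```

With a panel of listeners, the same function would be applied to each listener's answer sheet and the resulting scores averaged.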

Speech  comprehension  tests  differ  from  intelligibility 
tests  in  that  the  listener  must  do  more  mental  processing. 
More  information  is  passed  to  the  listener  before  a  response 
is  required.  The  listener  must  then  use  the  accumulated 
information  to  make  a  decision.  Because  the  human  thought 
process  has  a  limited  input  data  rate,  these  tests  are  used 
to  examine  a  communications  system  for  degradation  when 
additional  mental  tasks  such  as  flying  an  aircraft  are 
performed.  Three  tests  in  this  area  are  the  Message  Rate 
Efficiency  Test,  the  Single  Answer  Sentence  Test,  and  the 
Repetition Rate Test. An example would be the Single Answer
Test, in which a simple question requiring a one-phrase
response (such as, "What is your altitude?") is posed to
the listener. The percentage of correct answers yields the
percent sentence comprehension or speech comprehension.

Speech  interference  tests  are  speech  audibility  tests. 
They  are  used  to  show  the  ability  of  a  listener  to  correctly 
hear phonemes or speech sounds sent over a communications
system. These tests are more objective than other tests
because they are based on signal-to-noise ratios. These
signal-to-noise ratio curves are plotted for various frequencies
over the audible range and these curves are used as
performance  curves  of  intelligibility.  The  most  common 
interference  test  is  the  Articulation  Index,  or  AI.  This 
test  is  a  very  lengthy  and  detailed  process,  but  it  has 
become  one  of  the  more  commonly  used  curve  plotting  methods. 
The  Acoustical  Society  of  America  has  written  a  complete 
standard on the proper method of calculation of the Articulation
Index (Ref 1). Although these tests are objective,
they are classified as listener tests because they are not
automatically calculated. They require plotting and evaluation
by humans. This evaluation is a subjective evaluation.

The fifth group of tests is the Subjective Appraisal
Tests. These tests require the listener to give an opinion
of the communications system's performance. The listener
judges the quality of the received speech. These tests are
based on the confidence the listener has in what he heard
and the effort required to understand the received speech.
In the Confidence Ratings Test, the listener rates the
confidence  he  has  in  what  he  heard.  It  uses  a  scale  from 
'positively  received  the  message  correctly'  to  'positively 
received  the  message  incorrectly.'  These  tests  are  used  on 
low  intelligibility  systems. 

Problem

When subjective testing methods are used to evaluate
communications systems, they produce highly repeatable and
usable results. The major drawback is that subjective
testing  is  very  expensive  in  both  manpower  and  monetary 
costs.  The  talkers  and  listeners  must  be  trained  in  the 
testing  methods,  and  then  they  must  undergo  hundreds  of 
hours  of  tests  to  obtain  valid  results.  A  typical  testing 
panel consists of one talker, nine listeners, and one controller
to run the test and evaluate the results. Special
rooms are used to control background noise, and noise generators
are required to add the necessary background noise
found in the environment of the system being tested. All
this  is  very  costly  when  a  system  might  be  tested  several 
times  for  just  minor  alterations. 

A  second  drawback  of  subjective  testing  is  that  it 
cannot  be  easily  performed  in  the  actual  system  environment. 
Simulated conditions are used because such places as aircraft
cockpits or battlefields prohibit on-scene listener
testing.

In  1969,  the  IEEE  published  the  "Recommended  Practice 
for  Speech  Quality  Measurements."  In  this  practice,  the 
IEEE  stated  that  since  the  start  of  the  research  for  the 
paper, no generally applicable method of preference measurement
had been developed (Ref 9:227). No standard method of
intelligibility scoring has been developed that can be used
as an overall guide to system performance. What is needed
is a method to easily and quickly calculate an intelligibility
score for a given system. It should be as close to
real time operation as possible and require a minimum of
personnel for proper operation. The method should
calculate an objective measure of the system performance
that can be used as a score of intelligibility. It
must be automated and correlate with subjective scores on
the same system.

Current Objective Measures

Much  research  is  being  conducted  to  find  an  objective 
measure  to  replace  subjective  testing.  Chambers'  report 
(Ref  2)  lists  five  automated  methods  in  use  in  1973.  These 
earlier  methods  were  the  Pattern  Correspondence  Index,  the 
Speech  Communication  Index  Meter,  the  Voice  Interference 
Analysis  Set,  the  Automated  Intelligibility  Measurement,  and 
the  Sound  Level  Meter.  Since  1973,  other  methods  such  as 
the  Automatic  Intelligibility  Test  Equipment  and  linear 
prediction  have  been  developed. 

The  Pattern  Correspondence  Index  measures  the  input  and 
output  of  the  system  and  integrates  the  difference  between 
the  two  waveforms  over  the  duration  of  the  speech  period. 
The  integrated  signal,  presented  as  a  meter  reading,  was 
related  to  the  intelligibility  of  the  system.  The  major 
problem  was  that  the  slightest  delay  in  the  system  caused 
the  two  waveforms  to  be  unmatched  and  the  index  failed. 
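
The delay sensitivity described above can be illustrated with a short sketch. It assumes the index is formed by summing the absolute sample differences (the text does not state how the difference is formed), and it uses a hypothetical 500 Hz tone sampled at 8000 samples per second. The function names are illustrative only.

```python
import math


def integrated_difference(x, y):
    """Sum of absolute sample differences between two equal-length waveforms."""
    if len(x) != len(y):
        raise ValueError("waveforms must have the same length")
    return sum(abs(a - b) for a, b in zip(x, y))


def delay(x, d):
    """Delay a waveform by d samples, padding the front with zeros."""
    return [0.0] * d + x[:len(x) - d]


# Hypothetical test tone: 500 Hz, 8000 samples per second, 160 samples.
tone = [math.sin(2 * math.pi * 500 * n / 8000) for n in range(160)]

aligned = integrated_difference(tone, tone)            # identical waveforms
shifted = integrated_difference(tone, delay(tone, 4))  # same waveform, 4 samples late
```

Although the delayed waveform carries exactly the same speech information, the integrated difference is no longer zero, which is the failure mode noted for this index.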

The  Speech  Communication  Index  Meter  (SCIM)  is  a  device 
that  automatically  measures  the  Articulation  Index  (Ref 
8:18). An earlier device called the Voice Interference
Analysis System (VIAS) also calculated the Articulation
Index but did not perform in a reliable manner. The SCIM is
a newer system modeled on the VIAS system. The VIAS system
has very limited use while the SCIM can be extremely helpful
in analysis of the time-varying aspects of communications
systems.

The Automated Intelligibility Measurement (AIM) uses
computerized speech recognition as a technique to find word
intelligibility scores. It is a method of phoneme matching
in which a phoneme is sent over the system and the output is
compared to a set of phoneme recognition 'masks' based on
actual subjective testing. This method was developed into a
system called Automatic Intelligibility Test Equipment
(AITE) as a computer software package for the U. S. Air
Force (Ref 10). This method is quite lengthy and time
consuming. A complete set of masks is needed for each
method and level of corruption used. The computer must
compare the output phoneme with every mask and then determine
the closest match. This system could never work in a
real time environment.

The Sound Level Meter is a very simple tool that has
been  used  to  evaluate  the  impact  of  acoustic  noise  on  speech 
communications.  The  major  drawback  of  the  meter  is  that  it 
is  an  averaging  device  and  the  same  meter  readings  may  be 
obtained  for  a  wide  variety  of  spectrum  shapes. 
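
A short sketch of this limitation, assuming the meter reports a root-mean-square (RMS) level: two hypothetical tones at different frequencies give the same reading even though their spectra are entirely different. The tone frequencies and the 8000 samples per second rate are illustrative choices.

```python
import math


def rms(x):
    """Root-mean-square level of a waveform."""
    return math.sqrt(sum(v * v for v in x) / len(x))


# Two hypothetical tones with the same amplitude but different frequencies.
low = [math.sin(2 * math.pi * 250 * n / 8000) for n in range(160)]
high = [math.sin(2 * math.pi * 2000 * n / 8000) for n in range(160)]

low_level = rms(low)
high_level = rms(high)
```

Both levels are equal, so an averaging meter alone cannot tell whether the noise energy falls inside or outside the band that matters for speech.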

The  newest  area  in  automatic  testing  is  Linear 
Prediction. "The basic idea behind linear predictive analysis
is that a speech sample can be approximated as a
linear combination of past speech samples. By minimizing
the sum of the squared differences (over a finite interval)
between the actual speech samples and the linearly predicted
ones, a unique set of predictor coefficients can be determined"
(Ref 14:396). In a system being designed by Gamauf
and Hartman (Ref 5), and later by Hartman and Boll (Ref 5),
several different combinations of these predictor coefficients
are being used to form an intelligibility score. No
clear and exact method has been found to use these coefficients
to produce a reliable and general method for scoring
speech quality and intelligibility. However, there has been
limited success in specific applications as is outlined in
these reports.

General Approach and Assumptions

The  purpose  of  this  report  is  to  describe  a  method  of 
automatic intelligibility scoring using the objective measures
of linear prediction. Linear prediction was selected
because  it  is  a  new  method  and  research  has  shown  promise  in 
using  linear  prediction  for  this  purpose.  It  can  be  easily 
implemented  on  a  computer  and  is  geared  toward  the  digital 
communications  domain.  Since  the  Department  of  Defense  has 
decided  to  switch  to  digital  voice  communications,  a  scoring 
system  or  metric  is  needed  that  performs  in  the  digital 
domain.

This study is very limited and will involve only the
testing of one communications system, a real-time linear
predictive vocoder currently under test by the U. S. Air
Force.  Since  this  vocoder  is  available,  it  will  be  used  as 
the  sample  digital  communications  system  under  test.  The 
assumption  will  be  made  that  this  vocoder  is  a  good  model  of 
a  digital  communications  system  and  could  be  of  future  use 
by the Department of Defense. Since linear predictive vocoder
techniques are under intense study by all branches of
the  military,  it  is  felt  that  this  is  a  good  assumption. 
This  report  will  outline  the  theory  of  linear  prediction, 
the  procedures  used  to  generate  test  data  and  subjective 
scores  on  the  vocoder,  and  implementation  of  the  linear 
predictive  coding  metric.  The  results  of  tests  using  the 
metric  will  be  analyzed  and  correlated  with  the  results  of 
the  subjective  scores.  Finally,  conclusions  will  be  drawn 
on  these  results  and  recommendations  will  be  made  as  to  the 
performance  of  the  metric  and  on  further  study. 


II. Linear Predictive Coding

Linear  predictive  coding  (LPC)  is  a  rapidly  growing 
area  of  interest  in  the  communications  field.  Various  areas 
of  application  include  voice  encoders  (or  vocoders),  speech 
recognition,  and  speaker  identification.  Although  linear 
prediction  has  only  had  widespread  use  in  the  past  fifteen 
years, it dates back to Gauss in 1795 (Ref 12:10) under the
more  descriptive  title  of  linear  least  squares  estimation. 
In  1975,  John  Makhoul  published  a  tutorial  review  of  linear 
prediction  in  the  IEEE  Proceedings  (Ref  11).  This  chapter 
is  based  mainly  on  that  review  and  presents  the  basic  theory 
behind  linear  prediction  and  the  autocorrelation  solution 
algorithm  of  the  linear  prediction  analysis  model. 

Model 

The first step in the description of linear prediction
is to discuss the model it is based upon. In applying
discrete time series analysis to speech, each continuous-time
signal s(t) is sampled to obtain a discrete-time signal
s(nT), where n is an integer variable and T is the sampling
interval. The sampling frequency is then 1/T. Henceforth,
s(nT) shall be abbreviated by s(n) or $s_n$ with no loss in
generality.

Consider  a  model  in  which  the  signal  s(n)  is  the  output 
of  a  system  with  an  unknown  input  u(n)  such  that: 

$$s(n) = -\sum_{k=1}^{p} a_k\, s(n-k) + G \sum_{l=0}^{q} b_l\, u(n-l), \qquad b_0 = 1 \tag{2.1}$$

where $a_k$, $1 \le k \le p$, $b_l$, $1 \le l \le q$, and G are the parameters of the
hypothesized system and s(n-k) are the past outputs. Equation
2.1 states that the output s(n) is a linear combination
of past outputs and past and present inputs, and thus is
predictable from these past outputs and inputs. This
results in the name linear prediction.

Equation  2.1  is  in  the  time  domain.  By  taking  the  Z 
transform  of  both  sides  and  regrouping  terms,  the  frequency 
domain  model  is  given  as: 


$$H(z) = \frac{S(z)}{U(z)} = G\, \frac{1 + \sum_{l=1}^{q} b_l\, z^{-l}}{1 + \sum_{k=1}^{p} a_k\, z^{-k}} \tag{2.2}$$

where

$$S(z) = \sum_{n=-\infty}^{\infty} s(n)\, z^{-n} \tag{2.3}$$

is the Z transform of s(n), and U(z) is the Z transform of
u(n). Equation 2.2 is the generalized pole-zero model of
the vocal tract. There are two special cases of this model
that are of interest. The first case is the all-zero model
where $a_k = 0$, $1 \le k \le p$. This is known as the moving average
model. The second case is the all-pole model where $b_l = 0$,
$1 \le l \le q$. This is referred to as the autoregressive model.


The general pole-zero model is then called the autoregressive
moving average model. The most widely used model
for linear prediction of speech is the all-pole model, in
which the numerator polynomial of Equation 2.2 reduces to 1. This will be the
model used in this chapter.
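
As a minimal sketch of the all-pole case of Equation 2.1, the difference equation s(n) = -sum of a_k s(n-k) + G u(n) can be run directly on an input sequence. The function name `all_pole_output`, the single coefficient, and the impulse input are hypothetical choices for illustration; samples before n = 0 are assumed to be zero.

```python
def all_pole_output(a, gain, u):
    """Run s(n) = -sum_{k=1..p} a[k-1] * s(n-k) + gain * u(n), with s(n) = 0 for n < 0."""
    s = []
    for n, u_n in enumerate(u):
        acc = gain * u_n
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc -= a_k * s[n - k]  # minus sign follows Equation 2.1
        s.append(acc)
    return s


# Hypothetical first-order model (p = 1, a_1 = -0.5) driven by a unit impulse.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = all_pole_output([-0.5], 1.0, impulse)
```

Here each output sample is half of the previous one, so the output is fully determined by past outputs once the input has ended, which is the property linear prediction exploits.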

Linear Prediction of Speech

Linear prediction of speech is based upon the idea that
a sample of a speech signal, s(n), can be approximated by a
weighted sum of the preceding p samples of speech, where p
is an integer. With the sign convention of Equation 2.1, this
yields the mathematical expression for s(n) as:

$$s(n) \approx -\sum_{i=1}^{p} a_i\, s(n-i) \tag{2.4}$$

where it is assumed that s(n) is the nth sample value of a
speech signal, s(t), sampled every T seconds. Equation 2.4
is an approximation to the speech signal and, thus, is not
exact. The error between the exact nth sample and its
approximation can be defined as:

$$e(n) = s(n) + \sum_{i=1}^{p} a_i\, s(n-i) \tag{2.5}$$

The purpose of linear prediction is to find the weights
(called predictor coefficients) that will minimize this
error in some sense for a specified time interval.

The minimization technique selected is the minimization
of the total squared error over a specified time
interval. This total squared error is defined as E:

$$E = \sum_{n} \left[ s(n) + \sum_{i=1}^{p} a_i\, s(n-i) \right]^2 \tag{2.6}$$

where the limits on n will define the interval over which
the squared error is to be minimized. These limits will be
discussed in detail later. To minimize Equation 2.6, the
partial derivative is taken with respect to each predictor
coefficient, $a_i$, and the result is set equal to zero. Doing
this, the following result is obtained:

$$\sum_{n} 2 \left[ s(n) + \sum_{k=1}^{p} a_k\, s(n-k) \right] s(n-i) = 0 \tag{2.7}$$

where $i = 1, 2, \ldots, p$.

Rearranging terms and the order of summation in 2.7 gives
the result:

$$\sum_{k=1}^{p} a_k \sum_{n} s(n-k)\, s(n-i) = -\sum_{n} s(n)\, s(n-i), \qquad i = 1, 2, \ldots, p \tag{2.8}$$

Now the limits on n must be defined to solve Equation 2.8.
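
The error being minimized can be sketched numerically as follows. The sketch forms the approximation with the sign convention of Equation 2.1, that is, s_hat(n) = -sum of a_i s(n-i), takes the error as the difference between each sample and its approximation, and sums the squared errors over the frame. The function names, the four-sample signal, and the two candidate coefficient sets are hypothetical, and samples before n = 0 are assumed to be zero.

```python
def prediction_error(s, a):
    """Return e(n) = s(n) - s_hat(n), where s_hat(n) = -sum_{i=1..p} a[i-1] * s(n-i)."""
    errors = []
    for n in range(len(s)):
        s_hat = -sum(a_i * s[n - i] for i, a_i in enumerate(a, start=1) if n - i >= 0)
        errors.append(s[n] - s_hat)
    return errors


def total_squared_error(s, a):
    """Total squared prediction error over the whole frame."""
    return sum(e * e for e in prediction_error(s, a))


# Hypothetical frame in which every sample is half of the previous one.
signal = [1.0, 0.5, 0.25, 0.125]
good = total_squared_error(signal, [-0.5])  # coefficient that matches the decay
poor = total_squared_error(signal, [0.0])   # no prediction at all
```

The matching coefficient leaves only the error of the first sample, which has no past samples to predict it, while the zero coefficient leaves the full signal energy as error.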

The  selection  of  the  solution  technique  specifies  the 
limits on n. Two solution techniques outlined by Markel and
Gray (Ref 12) are the Covariance and the Autocorrelation
Methods. In the Covariance Method, the minimization of E is
defined for the interval of n = 0, 1, ..., N-1 consecutive
samples. The Autocorrelation Method defines the minimization
of E for $-\infty < n < +\infty$. For this method, the speech signal
is defined as:

$$s(n) = \begin{cases} s(n), & n = 0, 1, \ldots, N-1 \\ 0, & \text{otherwise} \end{cases} \tag{2.9}$$

This is done by using a window of length N on the signal
s(n).

For  this  study,  the  Autocorrelation  Method  was  selected 
because it ensures a stable model (Ref 12:130) and requires
fewer  calculations  in  the  solution. 

Autocorrelation  Method 

Using the Autocorrelation Method, Equation 2.8 can be
rewritten as:

$$\sum_{k=1}^{p} a_k \sum_{n=-\infty}^{\infty} s(n-k)\, s(n-i) = -\sum_{n=-\infty}^{\infty} s(n)\, s(n-i) \tag{2.10}$$

Letting j = n-i yields:

$$\sum_{k=1}^{p} a_k \sum_{j=-\infty}^{\infty} s(j+i-k)\, s(j) = -\sum_{j=-\infty}^{\infty} s(j+i)\, s(j) \tag{2.11}$$


and the estimate of the autocorrelation function of the
signal s(n) is:

$$R(i) = \sum_{n=-\infty}^{\infty} s(n)\, s(n+i) \tag{2.12}$$

where R(i) = R(-i).

Using the previous definition of s(n) from Equation 2.9
yields:

$$R(i) = \sum_{n=0}^{N-1-i} s(n)\, s(n+i) \tag{2.13}$$

for i = 1, 2, ..., p. Equation 2.13 is defined as the short-term
autocorrelation of s(n). Using this in Equation 2.11
yields:

$$\sum_{k=1}^{p} a_k\, R(i-k) = -R(i), \qquad i = 1, 2, \ldots, p \tag{2.14}$$


After the short-term autocorrelation is computed, Equation
2.14 represents p linear equations that can be solved
simultaneously for each $a_k$. A recursive solution has been
developed by Levinson that provides computational efficiency
in solving these equations (Ref 12).
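
Before turning to the recursive solution, the two steps above can be sketched directly: compute the short-term autocorrelation of Equation 2.13 (including R(0)), then solve the p equations of Equation 2.14 by ordinary Gaussian elimination. This is an illustrative sketch and not the method used in this study; the function names and the five-sample frame are hypothetical.

```python
def short_term_autocorrelation(s, p):
    """R(i) = sum_{n=0}^{N-1-i} s(n) * s(n+i) for i = 0..p (Equation 2.13)."""
    n_samples = len(s)
    return [sum(s[n] * s[n + i] for n in range(n_samples - i)) for i in range(p + 1)]


def solve_normal_equations(R):
    """Solve sum_k a_k R(|i-k|) = -R(i), i = 1..p, by Gaussian elimination."""
    p = len(R) - 1
    # Augmented matrix: Toeplitz coefficients on the left, -R(i) on the right.
    m = [[R[abs(i - k)] for k in range(1, p + 1)] + [-R[i]] for i in range(1, p + 1)]
    for col in range(p):
        # Partial pivoting keeps the elimination numerically well behaved.
        pivot = max(range(col, p), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, p):
            factor = m[row][col] / m[col][col]
            for j in range(col, p + 1):
                m[row][j] -= factor * m[col][j]
    a = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(m[i][j] * a[j] for j in range(i + 1, p))
        a[i] = (m[i][p] - tail) / m[i][i]
    return a


# Hypothetical frame of N = 5 samples and a first-order model (p = 1).
frame = [1.0, 0.5, 0.25, 0.125, 0.0625]
R = short_term_autocorrelation(frame, 1)
a = solve_normal_equations(R)
```

General elimination costs on the order of p cubed operations and ignores the fact that the coefficient matrix depends only on |i-k|; a recursion that exploits that structure is what provides the computational efficiency mentioned above.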
Levinson's Algorithm

Levinson's  algorithm  was  selected  by  Markel  and  Gray 
for  use  in  their  Fortran  IV  subroutine  AUTO  (Ref  12:216), 
which  provides  a  recursive  solution  to  Equation  2.14.  The 
following  definitions  are  made  to  simplify  notation: 

$A_i^{(p)}$ = The ith predictor coefficient of the pth order model
r(n) = Normalized short-term autocorrelation coefficients

where

$$r(n) = \frac{R(n)}{R(0)} \tag{2.15}$$

With these definitions, Equation 2.14 can be written as:

$$-r(j) = \sum_{i=1}^{p} A_i^{(p)}\, r(i-j), \qquad j = 1, 2, \ldots, p \tag{2.16}$$


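To start the algorithm, define a new quantity, $K_0^{(0)}$, as:

$$K_0^{(0)} = \frac{r(1)}{r(0)} \tag{2.17}$$

and recursively calculate $K^{(p)}$ using:

$$K_0^{(p)} \left[ r(0) - \sum_{i=0}^{p-1} K_i^{(p-1)}\, r(p-i) \right] = r(p+1) - \sum_{i=1}^{p} K_{i-1}^{(p-1)}\, r(i) \tag{2.18}$$

and

$$K_i^{(p)} = K_{i-1}^{(p-1)} - K_0^{(p)}\, K_{p-i}^{(p-1)}, \qquad i = 1, 2, \ldots, p \tag{2.19}$$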


Having calculated $K^{(p)}$ and defining $A_0^{(0)} = 1$, $A^{(p+1)}$ can be
calculated from:

$$A_{p+1}^{(p+1)} \left[ r(0) - \sum_{i=0}^{p} K_i^{(p)}\, r(p+1-i) \right] = r(p+1) - \sum_{i=0}^{p} A_i^{(p)}\, r(p+1-i) \tag{2.20}$$

and

$$A_i^{(p+1)} = A_i^{(p)} - K_i^{(p)}\, A_{p+1}^{(p+1)}, \qquad i = 0, 1, \ldots, p \tag{2.21}$$

This method generates the two vector quantities $A^{(p)}$ and
$K^{(p)}$. $A^{(p)}$ is a vector of the predictor coefficients for
the filter model. $K^{(p)}$ is a vector of the reflection coefficients.
Each $K_j^{(p)}$ is analogous to the reflection coefficients
of a p-section transmission line. If any transmission
line reflection coefficient has a magnitude greater than 1, the
circuit is unstable. This is also true for $K^{(p)}$; the Levinson
algorithm generates all $K^{(p)}$ values with magnitude less than 1 and
therefore yields a stable model.
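
For illustration, the widely used Durbin form of the recursion is sketched below. It is not a transcription of the equations above or of the AUTO subroutine; it is the textbook variant that solves the same autocorrelation equations, sum of a_k R(i-k) = -R(i), and returns the predictor coefficients, the reflection coefficients, and the final prediction error. The function name and the autocorrelation values are hypothetical.

```python
def levinson_durbin(r, p):
    """Solve sum_{k=1..p} a_k r(|i-k|) = -r(i), i = 1..p, by the Durbin recursion.

    r holds r(0)..r(p). Returns (a, k, err) where a[0] = 1 and a[1..p] are the
    predictor coefficients, k holds the p reflection coefficients, and err is
    the minimum total squared error of the order-p model.
    """
    a = [1.0] + [0.0] * p
    k = []
    err = r[0]
    for m in range(1, p + 1):
        if err <= 0.0:
            raise ValueError("autocorrelation sequence is not positive definite")
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k_m = -acc / err
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k_m * prev[m - i]  # update the lower-order coefficients
        a[m] = k_m
        err *= 1.0 - k_m * k_m                  # error shrinks at every order
        k.append(k_m)
    return a, k, err


# Hypothetical normalized autocorrelation values r(0), r(1), r(2).
r = [1.0, 0.5, 0.2]
a, k, err = levinson_durbin(r, 2)
stable = all(abs(k_m) < 1.0 for k_m in k)
```

Checking that every reflection coefficient has magnitude below 1, as in the last line, is a direct test of the stability property described above.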

The third quantity generated by the algorithm is the
minimum total squared error for the model. Setting $E_0 = 1$ and
recursively solving for $E_{p+1}$, the following equation is
generated:

$$E_{p+1} = E_p + A_{p+1}^{(p+1)} \left[ r(p+1) - \sum_{i=0}^{p} K_i^{(p)}\, r(i) \right] \tag{2.22}$$


This method of calculation of these three quantities is
incorporated in the Fortran IV subroutine AUTO by Markel and
Gray and is listed in the appendix of this report.

III. Procedure


In  order  to  study  Linear  Prediction  as  a  method  of 
objectively  determining  intelligibility,  a  communications 
system  was  used  that  was  also  under  testing  by  a  subjective 
listener  panel.  The  system  being  tested  was  the  LPC-10 
Vocoder  developed  at  Lincoln  Laboratory,  Massachusetts 
Institute  of  Technology.  This  system  was  a  microprocessor 
realization  of  a  linear  predictive  vocoder  and  is  described 
in  detail  in  Reference  7.  It  used  a  tenth-order  linear 
predictive approximation of the input signal, and was operated
at 2400 bits per second. Two major advantages of using
this system were that it was a real-time digital communications
system and subjective listener testing was available.

Data Tape Recording

The first step in the analysis process involved the
recording of speech before and after it was processed by the
vocoder. This recording was done at Wright-Patterson Air
Force Base, in the facilities of the Biological Acoustics
Branch of the Air Force Aerospace Medical Research
Laboratory. This facility was also testing the vocoder
using subjective listener testing.

Figure  1  shows  the  general  arrangement  of  equipment 
used  to  record  the  speech  tape.  The  laboratory  consisted  of 
ten talker/listener desk modules inside a soundproof room
that was equipped with the Air Force ESD 381 Jammer #1.
This system is capable of generating, inside the room,
cockpit noises that simulate various conditions for aircraft
used  by  the  Air  Force.  No  background  noise  was  used  for  the 
speech  tape  recording  so  that  the  no-noise  operation  of  the 
vocoder  could  be  tested. 

A  four  track  tape  recorder  was  used  to  record  both  the 
input  and  output  of  the  vocoder.  A  trained  male  talker  sat 
in the room wearing a standard Air Force helmet with the 5P
mask and M-101XA/C microphone. A Hewlett Packard 9848A
computer was used to control the testing. It had 300 words
from the Modified Rhyme Test stored in memory. This computer
was also used to administer the Modified Rhyme Test for
subjective  listener  testing. 

One complete run of the Modified Rhyme Test was conducted
to produce the speech tape. The HP 9848A computer
created  a  display  on  the  talker  desk  module  instructing  the 
talker  to  pronounce  one  of  the  words  from  the  first  column 
of Table I. Every ten seconds, the computer changed the
word  until  all  fifty  words  were  spoken.  The  complete  test 
lasted  approximately  nine  minutes  for  one  fifty  word  list. 
The  computer  was  also  connected  to  a  timing  generator  for 
tape alignment. At the start of each ten second word interval,
a single timing mark was produced by the timing generator.
At the end of each interval, two timing marks were
generated. A timing mark consisted of a rapid transition
from a zero level to a negative peak and then a return to a
zero level. These timing marks were recorded directly onto
track four of the speech tape for use in alignment of words
in the analysis process to be described later.
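
As a rough sketch of how such marks could be located once the timing track is digitized, the code below records each sample index at which the track first drops below a negative threshold. The threshold value, the function name, and the short synthetic track are assumptions for illustration; the actual alignment procedure is the one described later in this report.

```python
def find_timing_marks(track, threshold):
    """Return the sample indices where the track first drops below a negative threshold."""
    marks = []
    below = False
    for n, value in enumerate(track):
        is_below = value < threshold
        if is_below and not below:
            marks.append(n)  # leading edge of a negative-going pulse
        below = is_below
    return marks


# Hypothetical digitized timing track: zero level with three negative pulses.
track = [0.0] * 20
track[3] = -1.0   # single mark: start of a word interval
track[12] = -1.0  # first of two marks: end of the interval
track[15] = -1.0  # second of two marks
marks = find_timing_marks(track, threshold=-0.5)
```

A single isolated index would then be read as the start of a word interval and a closely spaced pair as its end.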

When  the  talker  saw  a  new  word  on  his  desk  module,  he 
spoke  the  number  of  the  word,  followed  by  a  short  phrase 
including  the  word.  An  example  for  the  first  word  on  the 
list would be: "Number 'one,' you will mark 'took,'
please." A carrier phrase was used as in the Modified
Rhyme Test because the talker pronounces the word differently
if just the word is spoken (Ref 15). This is a major
reason  why  the  Air  Force  has  selected  the  Modified  Rhyme 
Test  for  use  in  subjective  listener  testing.  Also,  the 
carrier  phrase  is  used  by  test  equipment  to  maintain  a 
constant level (using an automatic gain control) for recording
purposes. The output of the talker's microphone was
directly recorded on track one of the tape recorder. It was
also connected to the input of the LPC-10 Vocoder. The
output of the vocoder was recorded on track three of the
tape recorder. Track two was used for any general comments
by the administrator of the test. A tape of a complete
fifty word test was produced. Track one contained the
undistorted words directly from the speaker. Track three
contained the distorted words from the output of the vocoder.
Track four contained the timing marks. This tape was
then taken to the Signal Processing Laboratory at the Air
Force Institute of Technology for computer loading and analysis.

Computer Loading and Alignment of Speech Tape

The next step in the analysis process was to load the
speech recorded on tape into the signal processing computer
for  analysis.  Figure  2  shows  the  general  arrangement  of 
equipment  necessary  to  load  the  speech  data  into  the 
computer  memory. 

Each word was played back by the tape recorder and
passed through a six-pole Butterworth low-pass filter with a
3 dB cutoff frequency of 3.2 kHz and a rolloff of 48 dB per
octave. There were several reasons for using the 3.2 kHz
cutoff point. The major reason is that the analog-to-digital
sampler operates at 8000 samples per second; and, in order
to satisfy the Nyquist sampling theorem, the highest frequency
component in the signal must be less than 4 kHz. A
second reason is that the first three formant frequencies of
a male voice with an average vocal tract length of 17 cm
will lie in the frequency range of about 250-2800 Hz (Ref
12:153). Shorter vocal tracts, such as those of women and
children, produce formants in the range of about 300-3500 Hz.
This cutoff of 3200 Hz will allow the first three formants to
pass and reject the higher formants. These first three
formants are the major contributors to the speech waveform
and are necessary for speech intelligibility (Ref 4:53).
A third consideration is that the standard telephone system
is band-limited to a range of 300-3200 Hz; the 3200 Hz upper
limit was adopted so that the test system would conform to
use over standard telephone communications systems.

Figure 2. Computer Loading of Data Tape

The recorded words were played back, and after the
filter, they passed through an audio amplifier to speakers.
They were also input to the Cromemco computer, which sampled
the input at a rate of 8000 samples per second and stored
the samples in memory in the form of 88 blocks of digital
data, each block containing 256 samples. The net result was
that 2.82 seconds of speech were stored for each word.
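
The storage figures quoted above follow directly from the block size and the sampling rate. The short Python sketch below (Python is used here for convenience; it is not the language of the original programs, and the function name is illustrative only) reproduces the arithmetic.

```python
SAMPLE_RATE_HZ = 8000     # A/D sampling rate stated in the text
SAMPLES_PER_BLOCK = 256   # samples stored in each block
BLOCKS_PER_WORD = 88      # blocks stored for each word


def word_storage(blocks=BLOCKS_PER_WORD,
                 block_size=SAMPLES_PER_BLOCK,
                 rate_hz=SAMPLE_RATE_HZ):
    """Return (samples per word, duration in seconds) for one stored word."""
    samples = blocks * block_size
    return samples, samples / rate_hz


samples, seconds = word_storage()
# 88 blocks of 256 samples is 22528 samples, or about 2.82 s at 8000 samples/s.
print(samples, round(seconds, 2))
```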

The  Cromemco  computer  was  controlled  by  the  NOVA  2 
computer.  Capt.  Paul  Finkes,  USAF,  wrote  for  the  NOVA 
several  useful  control  programs  that  allowed  data  to  be 
recorded  via  the  Cromemco  computer.  The  two  main  programs 
used  were  AUDIOHIST  and  AUDIOMOD.  These  programs  are  listed 
and  described  in  Reference  3. 

Actual  analysis  of  the  data  was  to  be  done  on  the 
ECLIPSE  S/250  computer.  Figure  2  shows  that  data  files  may 
be  transferred  from  the  NOVA  to  the  ECLIPSE  through  a  common 
disk directory DPO/DPOF. Since each word required 88 blocks
of storage space and there were 100 words, 50 undistorted
and 50 distorted, there was not enough space on disk to
store all 100. Therefore, the word files were stored on
magnetic  tape  from  the  ECLIPSE  so  that  ample  space  would 
exist  for  processing  and  analysis  programs. 

The  program  AUDIOHIST  was  used  to  read  in  each  word 
from tape. If Figure 2 is examined closely, it can be seen
that  as  each  word  was  played  back,  the  timing  marks  on  track 
four  were  also  played  back.  The  switch  allowed  selection 
between  the  undistorted  words  on  track  one  and  the  distorted 
words on track three.

Once word pairs of undistorted and distorted words were
read  into  the  computer,  they  had  to  be  aligned  before  any 
analysis  could  take  place.  This  was  the  purpose  of  the 
timing  marks  for  each  word.  With  the  use  of  the  edit 
function  of  the  program  AUDIOMOD,  the  block  containing  the 
timing  mark  was  found.  This  was  easily  accomplished  because 
the  timing  marks  had  higher  peak  values  than  the  actual 
speech,  and  AUDIOMOD  produced  a  summary  listing  of  peak 
values  in  each  block. 

Once  the  timing  mark  blocks  were  found,  the  program 
BLOCKOUT.FR  was  used  to  print  out  a  listing  of  all  256 
sample  values  in  that  block.  A  complete  listing  of 
BLOCKOUT.FR  can  be  found  in  the  appendix  of  this  report. 
From  the  256  sample  values,  the  position  of  the  timing  marks 
was found for each word in the pair. It was found that the
timing mark consisted of a peak negative voltage of approximately
-3.00 volts followed 33 samples later by a peak
positive voltage of +3.00 volts.
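
The search for the timing mark described above can be sketched as follows. This is only an illustration in Python, not the AUTOMOD or BLOCKOUT.FR code: the count-to-volt scaling (divide by 2048, multiply by 5) is taken from the BLOCKOUT.FR listing in the appendix, while the function names and the 2.5 volt detection threshold are assumptions introduced for the example.

```python
BLOCK_SIZE = 256            # samples per block, as stated in the text
COUNTS_FULL_SCALE = 2048.0  # divisor used by BLOCKOUT.FR
VOLTS_FULL_SCALE = 5.0      # BLOCKOUT.FR maps counts to -5 V .. +5 V


def counts_to_volts(block):
    """Convert raw A/D counts to volts the way BLOCKOUT.FR does."""
    return [(count / COUNTS_FULL_SCALE) * VOLTS_FULL_SCALE for count in block]


def find_timing_mark(block_number, block, threshold_volts=2.5):
    """Return the absolute sample index of the negative timing peak, or None.

    threshold_volts is a hypothetical detection level chosen below the
    roughly 3 V peak reported in the text.
    """
    volts = counts_to_volts(block)
    index = min(range(len(volts)), key=lambda i: volts[i])
    if volts[index] > -threshold_volts:
        return None  # no sample in this block is negative enough
    return block_number * BLOCK_SIZE + index


def alignment_offset(undistorted_mark, distorted_mark):
    """Samples by which the distorted mark trails (+) or leads (-)."""
    return distorted_mark - undistorted_mark
```

Given the mark positions for both files of a word pair, `alignment_offset` yields the number of places by which the distorted file would have to be shifted.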

Once the location of the timing marks was known, the
words could be aligned. The number of samples between the
distorted word timing marks and the undistorted word timing
marks was found from the BLOCKOUT outputs. Then, a program
called SHIFT.FR was used to shift the distorted word samples
up or down the number of places needed to align the
distorted word file with the undistorted word file. A
listing of SHIFT.FR can also be found in the appendix. This
program was designed to shift a speech file the number of
samples specified in the selected direction. It also
zero-fills as necessary the end block from which the last values
are shifted and throws away the values shifted out of the
first block of the shift. Thus, when analysis is done, the
first and last blocks should not be used if the shift was
large. Once the shift was completed, a third file was
created that contained the shifted version of the distorted
word. At this point, 150 word files existed: 50 undistorted
word files, 50 distorted but not shifted word files, and 50
distorted and shifted word files. All these files were
stored on magnetic tape and were loaded into memory only for
analysis; they were deleted from memory when analysis was
completed.
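
The shifting behavior described above (samples moved by a fixed number of places, zero-fill at the vacated end, overflow discarded) can be sketched in a few lines. This is a Python illustration under stated assumptions rather than a transcription of SHIFT.FR: it works on a whole word held in a list instead of block by block, and the direction names "earlier" and "later" are hypothetical stand-ins for the program's up/down selection.

```python
def shift_samples(samples, places, direction):
    """Shift a word's samples, keeping the file length unchanged.

    direction == "earlier": samples move toward the start of the file;
                            the first `places` values are discarded and
                            zeros are appended at the end.
    direction == "later":   samples move toward the end of the file;
                            zeros are inserted at the start and the last
                            `places` values are discarded.
    """
    if places < 0:
        raise ValueError("places must be non-negative")
    n = len(samples)
    places = min(places, n)  # a shift longer than the file leaves only zeros
    if direction == "earlier":
        return list(samples[places:]) + [0] * places
    if direction == "later":
        return [0] * places + list(samples[:n - places])
    raise ValueError("direction must be 'earlier' or 'later'")
```

Because the vacated positions hold zeros rather than speech, the blocks at the affected end of the file carry no valid data after a large shift, which is why the text advises leaving the first and last blocks out of the analysis in that case.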

Computer Implementation of LPC

Once all the words were aligned and stored, the analysis
could be started. Since Linear Prediction was selected
as the model to be used, two values had to be selected.
These inputs to the subroutine AUTO were N, the number of
data samples per analysis window, and M, the order of the
LPC filter model to be used.

The choice of the analysis interval N is determined by
the assumption that the vocal tract movement was negligible
on the order of 15-20 ms for most vowels. "Absolute placement
of a 15-20 ms interval will not substantially affect
the results of either the covariance or the autocorrelation
method in most instances" (Ref 12:156). For the autocorrelation
method, this meant that pitch asynchronous analysis
(the arbitrary placement of the time interval) could be
used. Since each sample was 0.125 ms, we could use 128
samples per window for a 16.0 ms window length. Thus, N = 128
was used for this analysis. This meant that each block of
data could be divided in half and two sets of LPC parameters
would be generated for each block.

The second input to AUTO is M, the order of the LPC
filter model. It is desirable to use the lowest order
possible because the larger the order of the filter, the
more coefficients that will be calculated and the longer the
program will take. M is limited to a maximum of 20 by the
AUTO subroutine (M < 21) because its autocorrelation array
holds only 21 values. M is also limited to a minimum value
by the vocal tract length. Markel and Gray have stated this
relationship to be:

$$M = \frac{2 L f_s}{c} \tag{3.1}$$

where L is the vocal tract length (previously assumed to be
17 cm), f_s is the sample rate of 8000 samples per second,
and c is the speed of sound, 34 cm/ms (Ref 12:154). Using
these values, M was found to be 8. Thus, M = 8 was used for
this analysis, and the predictor coefficient vector and the
reflection coefficient vector each contained eight
elements.
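
The two parameter choices above reduce to simple arithmetic, which the following Python sketch reproduces. The function names are illustrative only; the speed of sound is entered as 34000 cm/s, the same value as the 34 cm/ms quoted in the text.

```python
def window_length_ms(samples_per_window, sample_rate_hz):
    """Duration of one analysis window in milliseconds."""
    return 1000.0 * samples_per_window / sample_rate_hz


def lpc_order(tract_length_cm, sample_rate_hz, sound_speed_cm_per_s=34000.0):
    """Filter order from Equation 3.1, M = 2 L f_s / c."""
    return 2.0 * tract_length_cm * sample_rate_hz / sound_speed_cm_per_s


# N = 128 samples at 8000 samples/s gives a 16 ms window,
# inside the 15-20 ms range discussed in the text.
print(window_length_ms(128, 8000))

# L = 17 cm and f_s = 8000 samples/s give M = 8.
print(lpc_order(17.0, 8000))
```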

The Fortran IV program AUTOLPC.FR was written to control
the use of the subroutine AUTO. The listing of
AUTOLPC.FR can be found in the appendix. When this program
is called, it first asks for the filename of the undistorted
speech file. Then, it asks for the distorted shifted speech
file. Next, the values for N and M are entered. The last
two entries are for the first and last blocks to be included
in the analysis. This option allows for future analysis of
separate parts of the words. In this analysis, the start
block used was block zero and the end block used was the
last block before the block containing the first end timing
mark.

Once  all  entries  were  made  in  AUTOLPC.FR,  the  analysis 
started.  Starting  with  the  undistorted  word  file,  the  first 
block  was  analyzed  by  subroutine  AUTO  and  the  predictor 
coefficients,  reflection  coefficients,  and  total  squared 
error  were  returned  to  the  main  program.  Then  the  next 
block  was  analyzed  by  AUTO,  and  so  forth  until  the  last 
block  designated  was  analyzed.  Next,  the  distorted  and 
shifted  speech  file  was  analyzed  in  the  same  manner.  The 
time  required  to  analyze  both  words  averaged  two  minutes. 
After  both  words  were  analyzed,  the  main  program  used  the 
values  returned  by  AUTO  and  calculated  an  intelligibility 
score  for  that  particular  word  pair.  This  score  was  printed 
to  the  terminal  screen  to  be  recorded  by  the  operator.  The 
method  used  to  calculate  the  intelligibility  score  will  be 
discussed  in  the  next  section. 
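
The per-window analysis described above can be sketched in Python as follows. This is a minimal port of the autocorrelation method as it appears in the AUTO listing in the appendix, not a copy of AUTOLPC.FR: the function names, the early return for an all-zero window, and the whole-word helper `summed_residual` are assumptions made for illustration.

```python
def lpc_autocorrelation(frame, order):
    """Autocorrelation-method LPC for one analysis window.

    Returns (a, alpha, rc): predictor coefficients with a[0] = 1,
    residual energy, and reflection coefficients.
    """
    n = len(frame)
    # Autocorrelation lags 0..order of the finite window.
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    rc = [0.0] * order
    alpha = r[0]
    if alpha == 0.0:
        return a, 0.0, rc  # assumption: a silent window contributes nothing
    for m in range(1, order + 1):
        # Correlation of the current predictor with the next lag.
        s = sum(a[i] * r[m - i] for i in range(m))
        k = -s / alpha
        rc[m - 1] = k
        previous = a[:]
        for i in range(1, m):
            a[i] = previous[i] + k * previous[m - i]
        a[m] = k
        alpha += k * s
        if alpha <= 0.0:
            break  # AUTO prints a singular-matrix warning and returns here
    return a, alpha, rc


def summed_residual(samples, window=128, order=8):
    """Sum of the residual energies over consecutive windows of one word."""
    total = 0.0
    for start in range(0, len(samples) - window + 1, window):
        _, alpha, _ = lpc_autocorrelation(samples[start:start + window], order)
        total += alpha
    return total
```

Calling `summed_residual` once on the undistorted samples and once on the shifted distorted samples gives the two totals from which a score for the word pair is formed.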

After all 50 word pairs were analyzed and scores obtained
for each word pair, the 50 scores were averaged to
find the mean intelligibility score and the standard deviation
associated with the 50 scores about the mean. These
values will be presented in the results section of Chapter
IV of this report.

Calculation of Objective Intelligibility Score

The subroutine AUTO contains three variable names
associated with the LPC analysis of Chapter II. The
variable A is the array of predictor coefficients, the
variable RC is the array of reflection coefficients,
and ALPHA is the minimum total squared error $E_{p+1}$. The
third variable, ALPHA, is what was used to determine the
intelligibility. Markel and Gray (Ref 12:217) state that
this minimum total squared error can be considered as residual
energy between the actual signal s(n) and the predicted
signal generated by the linear prediction process. Each
time AUTO is called and run, it returns one value of ALPHA
for each 128 samples used. These ALPHA values are summed by
the main program for each block of data until the entire
word is analyzed. The sum of these ALPHA values is called
ALSUMI for the undistorted word file and ALSUMD for the
distorted shifted word file. They can be shown by a comparison
to the $E_{p+1}$ values from Chapter II as:

$$\mathrm{ALSUMI} = \sum_{\text{all blocks}} \mathrm{ALPHAI} \tag{3.2}$$

$$\mathrm{ALSUMD} = \sum_{\text{all blocks}} \mathrm{ALPHAD} \tag{3.3}$$

where the ALPHAI are the individual $E_{p+1}$ values for the
undistorted word file and the ALPHAD are the individual $E_{p+1}$
values for the distorted word file.

Once these two sums were found, the score could be
calculated. Two combinations of these values were used to
score intelligibility. These two methods were called SUMA
and SUMB and are defined as:

$$\mathrm{SUMA} = \left[100.0 - (\,\cdots\,)(100.0)\right]\% \tag{3.4}$$

$$\mathrm{SUMB} = \left[100.0 - \left(\frac{\mathrm{ALSUMD} - \mathrm{ALSUMI}}{\mathrm{ALSUMD}}\right)(100.0)\right]\% \tag{3.5}$$

Both methods could be used to find an intelligibility score,
but SUMB produced values in the 0-100% range that could be
used directly as an intelligibility score.
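
As an illustration of the SUMB calculation, the Python sketch below forms the two sums and the score. It assumes the form of Equation 3.5 that is consistent with Equations 3.6 and 3.7, namely the difference of the two sums normalized by the distorted sum; the function names are hypothetical and do not appear in AUTOLPC.FR.

```python
def sumb_score(alphas_undistorted, alphas_distorted):
    """SUMB in percent from the per-window ALPHA values of a word pair."""
    alsumi = sum(alphas_undistorted)  # ALSUMI, Equation 3.2
    alsumd = sum(alphas_distorted)    # ALSUMD, Equation 3.3
    if alsumd == 0.0:
        raise ValueError("distorted word has zero residual energy")
    return 100.0 - ((alsumd - alsumi) / alsumd) * 100.0


def sumb_ratio_form(alphas_undistorted, alphas_distorted):
    """The same score written as the ratio of the two sums."""
    return sum(alphas_undistorted) / sum(alphas_distorted) * 100.0
```

The two forms are algebraically identical. Written as a ratio, it is also plain that the score exceeds 100 whenever the undistorted sum is larger than the distorted sum.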

Equation 3.5 can be rewritten as

$$\%\,\text{Intelligibility} = 100.0 - \left[\frac{\sum E_d - \sum E_u}{\sum E_d}\right](100.0) \tag{3.6}$$

where the $E_u$ values are the $E_{p+1}$ values of Equation 2.22 for
the undistorted word and the $E_d$ values are the $E_{p+1}$ values of
the distorted word. Further simplification of 3.6 yields:

$$\%\,\text{Intelligibility} = \frac{\sum E_u}{\sum E_d} \times 100.0\% \tag{3.7}$$

where the summations are of all the blocks selected for
analysis.

IV. Results


In  Chapter  III,  it  was  stated  that  the  LPC-10  Vocoder 
was  tested  by  a  Subjective  Listener  Panel  to  obtain  a 
subjective  evaluation  for  comparison  with  the  objective 
measure used in this report. This chapter presents the
results  of  both  the  subjective  testing  and  the  objective 
testing. 

Subjective Testing Results

Subjective  Listener  Testing  was  performed  on  the  LPC-10 
Vocoder  at  the  facilities  of  the  Biological  Acoustics  Branch 
of  the  Air  Force  Aerospace  Medical  Research  Laboratory.  The 
Modified  Rhyme  Test  (MRT)  was  used  on  a  ten  member  listener 
panel.  The  MRT  is  used  by  the  Biological  Acoustics  Branch 
because  members  of  the  branch  feel  it  is  the  best  available 
test  that  corresponds  with  Air  Force  missions.  It  uses  a 
carrier  phrase  to  simulate  actual  communications  methods  and 
allows six possible responses for each word sent. This has
been found to reduce the guessing that is possible in the
Diagnostic Rhyme Test, which has only two responses per word
(Ref 15).

For the tests on the LPC-10 Vocoder, four signal-to-noise
levels were used. The level of concern for this
report is Level 1, or the "no noise" level. At this
level, there is no special background noise being generated
and the no-noise characteristics of the vocoder are being
tested.

For the Level 1 (zero noise) test, five trials were
used. In each trial, one of the ten subjects was selected
as the talker and the nine remaining subjects were the
listeners. The panel was made up of both males and females
who were trained talkers/listeners. Each trial used a fifty
word list similar to Table I in Chapter I. Table II lists
the results of each trial in two forms. The first value is
the number of correct responses out of the fifty words
spoken and has a range of values from 0 to 50. The second
value is the corrected intelligibility score. This score
has been corrected for guessing using the correction method
of:

Corrected Score = (2.4 × Number correct out of 50) - 20 (4.1)

This correction method is the standard method used by the
Biological Acoustics Branch for all MRT functions conducted
by that branch.
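
Equation 4.1 can be checked against the entries of Table II with a short Python sketch. The general form used below, in which the number wrong is divided by the number of alternatives minus one, is the usual correction for guessing and is offered here as an assumption about where the constants 2.4 and 20 come from; for six alternatives and fifty words it reduces algebraically to Equation 4.1. The function name is illustrative.

```python
def corrected_score(num_correct, num_items=50, num_choices=6):
    """Percent score corrected for guessing on a closed-response test."""
    num_wrong = num_items - num_correct
    adjusted = num_correct - num_wrong / (num_choices - 1)
    return 100.0 * adjusted / num_items


# Spot checks against Table II entries: 40 correct, 35 correct, 50 correct.
for right in (40, 35, 50):
    print(right, round(corrected_score(right), 2))
```

With these defaults, a listener who answers at the chance rate of one word in six receives a corrected score of zero.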

The Table II results show that for zero noise, the LPC-10
Vocoder has an average intelligibility score of 84.91%
with a standard deviation of 7.47%. This subjective score
can be interpreted to mean that under the above stated
conditions, the system will be 84.91% ± 7.47% intelligible.

Three other noise levels were also tested and are
presented in Figure 3. Level 2 is the 95 dB noise level,
Level 3 is the 105 dB noise level, and Level 4 is the 115 dB
noise level.

TABLE II

Subjective Listener Testing Results
for Modified Rhyme Test
0 dB General Noise Case
Old Microphone/5P Mask

Subject  Trial 1      Trial 2      Trial 3      Trial 4      Trial 5
-------  -----------  -----------  -----------  -----------  -----------
1        40/76.00     40/76.00     39/73.60     TALKER       38/71.20
2        47/92.80     45/88.00     44/85.60     46/90.40     46/90.40
3        TALKER       45/88.00     44/85.60     41/78.40     49/97.60
4        46/90.40     45/88.00     TALKER       43/83.20     48/95.20
5        45/88.00     45/88.00     40/76.00     42/80.80     46/90.40
6        48/95.20     TALKER       46/90.40     47/92.80     47/92.80
7        50/100.0     44/85.60     40/76.00     42/80.80     TALKER
8        40/76.00     40/76.00     36/66.40     35/64.00     43/83.20
9        47/92.80     45/88.00     41/78.40     44/85.60     48/95.20
10       48/95.20     44/85.60     41/78.40     40/76.00     47/92.80
AVG      45.67/89.60  43.67/84.80  41.22/78.93  42.22/81.33  45.78/89.87

TOTAL AVERAGE = 84.91% INTELLIGIBLE
STANDARD DEVIATION = 7.47

Notes:

1. All scores are presented as:
(Number correct out of 50/Corrected Intelligibility Score)

2. All scores are corrected for guessing using the formula:
Corrected Score = (2.4 × Number correct out of 50) - 20

[Figure 3. General Noise, 5/P Mask, old microphone]


Objective  Testing  Results 

Following  the  procedures  outlined  in  Chapter  III  of 
this  report,  a  total  of  50  words  were  analyzed  using  the  LPC 
measure  described.  This  measure  is  the  total  squared  error 
measure  of  Equation  3.6  and  is  called  SUMB  in  these  results. 

Table  III  presents  a  word  by  word  listing  of  results 
for  the  calculation  of  SUMB.  In  this  table,  the  column 
labeled  Block  Length  represents  the  actual  number  of  blocks 
that  were  analyzed.  For  instance,  the  word  TOOK  was 
analyzed  from  block  0  to  block  51  for  a  total  of  52  blocks. 
The end block was determined by selecting the block containing
the end timing mark used for alignment of words.
The shorter block lengths were used so that less calculation
time was needed. Use of the end block containing the timing
mark assured that all of the word was analyzed because the
timing mark came after the word was spoken. From Table III,
the average value of SUMB was calculated for the fifty
words. The average SUMB = 82.99% with a standard deviation
of 14.41%. If SUMB is used as a score of intelligibility,
it does correspond to the subjective score of 84.91% ±
7.47%. This shows the desired correspondence between the
subjective and objective measures used in this report.

Although the objective score is well within one standard
deviation of the subjective score, it should be noted
that the standard deviation of the objective score is quite
large. Because of this large deviation, it would appear
that the objective measure could very easily have fallen
outside one standard deviation of the subjective score.
However, this is misleading. If the standard deviation is
calculated on just the first 25 words, an average SUMB of
82.81% is obtained but the standard deviation is 18.61%. It
can be seen that the more words that were analyzed, the
smaller the standard deviation became, while the average
value of SUMB remained nearly the same. Thus, if a larger
number than 50, say 200, words were analyzed, it appears
that the standard deviation might be reduced to an even more
acceptable level under 10%.

TABLE III

Objective Intelligibility Results

WORD    BLOCK              WORD    BLOCK
FILE    LENGTH    SUMB     FILE    LENGTH    SUMB
-----   ------   ------    -----   ------   ------
TOOK      52      84.43    TACK      46      61.74
GUST      65      86.57    MAT       44      81.39
GANG      66      65.82    FIB       53      88.82
PEACH     63      93.62    SHOP      64      80.26
SUP       68      80.53    WILL      65      85.20
BASS      48      71.58    SANE      63      78.72
PACK      66      69.69    PANE      55      94.32
PIN       60      90.92    FEEL      59      78.15
COIL      51      85.08    RED       57      83.50
SAD       51      69.92    KILL      59      92.71
DUG       61      76.05    DIM       61      97.50
TIP       64     114.14    SAME      57      80.53
CUFF      66      96.06    PEN       63      80.54
GALE      66      70.44    CAVE      61      74.56
DAY       63      69.32    SIN       61      93.18
LAW       62     134.66    PARK      60      70.80
TEST      56      97.79    PICK      51     102.43
LAY       60      66.10    DIN       44      80.65
FEAT      66      62.94    BUCK      63      87.96
BENT      57      68.51    FOLD      61      80.80
BIG       52      70.10    PUN       59      87.19
SUN       42      90.11    RAKE      64      74.94
HOT       52      65.79    BEAK      62      78.50
FIT       45     119.72    SEED      50      78.63
TEASE     52      70.32    HEAVE     65      86.29

AVERAGE SUMB = (4149.52/50) = 82.99% INTELLIGIBLE
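
The summary statistics quoted for Table III can be recomputed from the tabulated SUMB values. The Python sketch below does so using the standard library; treating the quoted figures as sample standard deviations (divisor n - 1) is an assumption, although it appears to be the convention that matches the reported values.

```python
import statistics

# SUMB values from Table III, first 25 words followed by the last 25.
FIRST_25 = [84.43, 86.57, 65.82, 93.62, 80.53, 71.58, 69.69, 90.92, 85.08,
            69.92, 76.05, 114.14, 96.06, 70.44, 69.32, 134.66, 97.79, 66.10,
            62.94, 68.51, 70.10, 90.11, 65.79, 119.72, 70.32]
LAST_25 = [61.74, 81.39, 88.82, 80.26, 85.20, 78.72, 94.32, 78.15, 83.50,
           92.71, 97.50, 80.53, 80.54, 74.56, 93.18, 70.80, 102.43, 80.65,
           87.96, 80.80, 87.19, 74.94, 78.50, 78.63, 86.29]


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


mean_all, sd_all = summarize(FIRST_25 + LAST_25)
mean_25, sd_25 = summarize(FIRST_25)
print(round(mean_all, 2), round(sd_all, 2))
print(round(mean_25, 2), round(sd_25, 2))
```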




V. Conclusions and Recommendations


Conclusions 

The  major  conclusion  that  may  be  drawn  from  this  report 
is  that  the  linear  predictive  coding  total  squared  error 
metric  examined  in  Chapters  III  and  IV  does  perform  as  an 
objective  intelligibility  measure  for  the  no-noise  case  for 
the LPC-10 Vocoder under test. It resulted in an intelligibility
score of 82.99%, which is within one standard
deviation of the subjective score of 84.91%. However, this
is a very limited result and should be tested for other
systems and noise levels for a better examination of the
performance of this metric.

In  a  report  by  Ottinger  in  1978,  a  metric  was  developed 
that  is  nearly  the  same  as  the  metric  used  in  this  report. 
This  was  Distance  Measure  1  in  the  Ottinger  report  (Ref 
13:25).  However,  Ottinger  found  no  correlation  between  his 
Distance Measure 1 and subjective scores for his system
under test. The major difference between the Ottinger
method and the method used in this report is that Ottinger
did not align the words before analysis. This is the only
major difference and has resulted in two totally different
results. It appears that for this intelligibility metric to
perform properly, the undistorted and distorted words must
be aligned before the intelligibility score is calculated.
Since the scoring involves the direct comparison between the
two words, it is logical that the two words must be aligned
for an accurate comparison. This alignment was within one
sample out of 128 samples per analysis window for this
study.

A  third  conclusion  is  that  the  data  base  should  be 
large  for  an  accurate  objective  score.  This  must  be 
balanced  with  the  goal  of  reducing  the  time  required  for 
scoring.  Each  word  pair  required  approximately  one-half 
hour  for  computer  loading,  alignment,  and  analysis.  This  is 
not  a  real-time  analysis  method,  but  is  still  less  time 
demanding than subjective testing. This is because subjective
testing usually requires a group of ten or more people
for a period of one to two days for one complete test run.
A major problem in subjective testing is that the listener
panel results tend to decay if too many tests are run in one
day. The LPC metric is fully computerized and can be
run continually with no degradation as long as the computer
functions properly.

Recommendations 

It  is  recommended  that  this  study  be  continued  and  more 
research  done  on  various  other  systems  and  noise  levels. 
Since  this  study  was  done  on  one  communications  system  with 
only one noise level, it has very limited results. Further
testing  should  be  done  to  prove  or  disprove  this  metric  for 
a  wider  range  of  communications  systems  and  noise  levels. 

It  is  further  recommended  that  100  word  pairs  be  used 
instead  of  50  for  a  more  accurate  result.  This  should 
further reduce the problem of the large standard deviation
encountered. If many more than 100 are used, there will be
little time savings over subjective listener testing.
Therefore, 100 appears to be a good compromise.

A final recommendation is that the Ottinger thesis (Ref
13) be repeated but that the words should be aligned using
the method in this report. If this is done, it is possible
that one of the other metrics examined in that report may
prove to be an even better intelligibility measure.

APPENDIX

COMPUTER PROGRAM LISTINGS


C
C*************************************************************
C
C     SUBROUTINE: AUTO
C     A SUBROUTINE FOR IMPLEMENTING THE AUTOCORRELATION
C     METHOD OF LINEAR PREDICTION ANALYSIS
C
C*************************************************************
C
      SUBROUTINE AUTO(N, X, M, A, ALPHA, RC)
C
C     INPUTS:   N     - NO. OF DATA POINTS
C               X(N)  - INPUT DATA SEQUENCE
C               M     - ORDER OF FILTER (M<21, SEE NOTE*)
C     OUTPUTS:  A     - FILTER COEFFICIENTS
C               ALPHA - RESIDUAL "ENERGY"
C               RC    - REFLECTION COEFFICIENTS
C
C     *PROGRAM LIMITED TO M<21 BECAUSE OF DIMENSIONS OF R(.)
C
      DIMENSION X(256), A(260), RC(260)
      DIMENSION R(21)
      MP = M + 1
      DO 20 K=1,MP
        R(K) = 0.
        NK = N - K + 1
        DO 10 NP=1,NK
          N1 = NP + K - 1
          R(K) = R(K) + X(NP)*X(N1)
   10   CONTINUE
   20 CONTINUE
      RC(1) = -R(2)/R(1)
      A(1) = 1.
      A(2) = RC(1)
      ALPHA = R(1) + R(2)*RC(1)
      DO 50 MINC=2,M
        S = 0.
        DO 30 IP=1,MINC
          N1 = MINC - IP + 2
          S = S + R(N1)*A(IP)
   30   CONTINUE
        RC(MINC) = -S/ALPHA
        MH = MINC/2 + 1
        DO 40 IP=2,MH
          IB = MINC - IP + 2
          AT = A(IP) + RC(MINC)*A(IB)
          A(IB) = A(IB) + RC(MINC)*A(IP)
          A(IP) = AT
   40   CONTINUE
        A(MINC+1) = RC(MINC)
        ALPHA = ALPHA + RC(MINC)*S
        IF (ALPHA) 70, 70, 50
   50 CONTINUE
   60 RETURN
   70 CONTINUE
C
C     WARNING - SINGULAR MATRIX
C
      IOUTD = 10
      IOUTP = 1
      WRITE (IOUTD,9999)
      WRITE (IOUTP,9999)
 9999 FORMAT (33H WARNING - SINGULAR MATRIX - AUTO)
      GO TO 60
      END

C*************************************************************
C
C     PROGRAM BLOCKOUT.FR
C
C     FORTRAN IV LISTING
C
C     LTJG J.A. KAYSER, USCG
C
C*************************************************************
C
C     IFILE-   FILE TO BE PRINTED
C     JFILE-   FILENAME OF OUTPUT FILE
C     ISTORE-  256 VALUE INTEGER ARRAY USED TO
C              STORE THE VALUES OF EACH BLOCK
C              TO BE PRINTED.
C     ASTORE-  256 VALUE REAL ARRAY USED TO
C              STORE THE CONVERTED VOLTAGES
C              FROM ISTORE
C     SBLK-    STARTING BLOCK TO BE PRINTED
C     CBLK-    THE TOTAL NUMBER OF BLOCKS
C              TO BE PRINTED
C     EBLK-    LAST BLOCK TO BE PRINTED
C     ST-      STATUS CHECK WORD USED IN THE
C              CALL STAT COMMAND
C
C*************************************************************
C
C     THIS PROGRAM IS USED TO PRINT OUT ANY NUMBER OF
C     BLOCKS IN A GIVEN SPEECH FILE.  THE OUTPUT IS
C     CONVERTED TO A REAL VOLTAGE VALUE BETWEEN -5.00
C     VOLTS AND +5.00 VOLTS.
C
      INTEGER IFILE(13),SBLK,CBLK,ISTORE(256),ST(22),
     :        IBLOCKS,JFILE(13),EBLK
      REAL ASTORE(256)
C
C     ENTER THE INPUT FILENAME TO BE PRINTED OUT AND
C     THE OUTPUT FILENAME DESIRED TO STORE RESULTS
C
      TYPE " INPUT FILENAME TO BE PRINTED: "
      READ(11,20) IFILE(1)
   20 FORMAT(S13)
C
      CALL STAT(IFILE,ST,IER)
      IF(IER.NE.1) GO TO 900
      IBLOCKS=ST(9)+1
      TYPE " BLOCK COUNT IS ",IBLOCKS
      CALL OPEN(5,IFILE,2,IER)
      IF(IER.NE.1) GO TO 900
C
      TYPE " INPUT DESIRED OUTPUT FILENAME: "
      READ(11,30) JFILE(1)
   30 FORMAT(S13)
      CALL OPEN(6,JFILE,2,IER)
      IF(IER.NE.1) GO TO 900
C
C*************************************************************
C
C     ENTER STARTING BLOCK AND NUMBER OF BLOCKS TO BE
C     PRINTED.  IF THIS EXCEEDS THE LAST BLOCK THEN THE
C     NUMBER OF BLOCKS TO BE PRINTED IS READJUSTED TO
C     STOP AT THE LAST BLOCK.
C
C*************************************************************
C
      ACCEPT " INPUT STARTING BLOCK: ",SBLK
C
      ACCEPT " INPUT NUMBER OF BLOCKS TO BE PRINTED: ",CBLK
      EBLK=SBLK+CBLK
      IF(EBLK.GT.ST(9))TYPE " ADJUSTED END BLOCK TO: ",ST(9)
      IF(EBLK.GT.ST(9))EBLK=ST(9)
   55 CALL RDBLK(5,SBLK,ISTORE,1,IER)
      IF(IER.NE.1) GO TO 900
C
C*************************************************************
C
C     CONVERT EACH BLOCK TO BE PRINTED INTO VOLTAGES AND
C     STORE IN THE ARRAY ASTORE.  WRITE ASTORE INTO THE
C     FILE NAMED BY JFILE
C
C*************************************************************
C
      DO 60 I=1,256
        ASTORE(I)=(ISTORE(I)/2048.0)*5.0
   60 CONTINUE
C
      WRITE(6,70)SBLK,IFILE(1)
   70 FORMAT(" ",//" BLOCK NUMBER:",I3," FILENAME: ",S13,/)
      WRITE(6,80)(ASTORE(K),K=1,256)
   80 FORMAT(" ",16F6.2)
C
      SBLK=SBLK+1
      IF(SBLK.LE.EBLK) GO TO 55
C
      TYPE " END OF PROGRAM. YOUR OUTPUT IS "
      WRITE(10,85)JFILE(1)
   85 FORMAT(" LOCATED IN A FILE NAMED: ",S13)
      GO TO 915
C
C*************************************************************
C
C     ERROR MESSAGE ROUTINE.  IT IS USED WHEN THE IER
C     VARIABLE IS RETURNED AS A NON-ONE VALUE DURING
C     ANY CALL COMMAND IN THE PROGRAM.
C
C*************************************************************
C
  900 TYPE " <7><7>** FORTRAN IV SYSTEM ERROR **<7><7> "
      TYPE " ERROR CODE=",IER
      TYPE " PROGRAM ABORTED "
C
C*************************************************************
C
C     END OF PROGRAM.  USES A CALL RESET COMMAND TO
C     RESET ALL CHANNELS OPENED.
C
C*************************************************************
C
  915 TYPE " END OF PROGRAM "
      CALL RESET
      END

C************************************************************
C************************************************************
C
C     PROGRAM SHIFT.FR
C
C     FORTRAN IV LISTING
C
C     LTJG J. A. KAYSER, USCG
C
C************************************************************
C************************************************************
C

C     IBLOCKS- NUMBER OF BLOCKS IN FILE
C
C     IFILE-   FILE TO BE SHIFTED
C
C     JFILE-   FILENAME OF SHIFTED OUTPUT FILE
C
C     ISHIFT-  NUMBER OF PLACES TO BE SHIFTED
C
C     ISTORE-  256 VALUE ARRAY USED TO STORE
C              ONE BLOCK OF UNSHIFTED DATA
C              FROM IFILE.
C
C     JSTORE-  256 VALUE ARRAY USED TO STORE
C              ONE BLOCK OF SHIFTED DATA FOR
C              STORAGE IN JFILE
C
C     MOVES-   DIRECTION OF SHIFT (1=UP/2=DOWN)
C
C     SBLK-    STARTING BLOCK OF THE SHIFT
C
C     ST-      STATUS CHECK WORD USED IN
C              THE CALL STAT COMMAND
C

C************************************************************
C
C     THIS PROGRAM IS USED TO ALIGN VOICE FILES BY
C     MOVING EACH WORD IN THE FILE BY A NUMBER OF
C     PLACES AS SELECTED BY THE USER.  IT CAN MOVE
C     EITHER UP OR DOWN.  THE END BLOCKS ARE ZERO
C     FILLED AS NECESSARY FOR THE SIZE OF THE SHIFT.
C
C************************************************************
C

      INTEGER IBLOCKS,ISHIFT,IFILE(13),MOVES,JFILE(13),
     : ISTORE(256),JSTORE(256),IER,ST(22),SBLK

C
C************************************************************
C
C     ZERO OUT THE ISTORE AND JSTORE ARRAYS AT START
C
C************************************************************
C
      DO 10 I=1,256
      ISTORE(I)=0
      JSTORE(I)=0
   10 CONTINUE

C
C************************************************************
C
C     ENTER THE INPUT FILENAME TO BE SHIFTED AND THE
C     OUTPUT FILENAME TO STORE RESULTS OF THE SHIFT
C
C************************************************************
C

      TYPE " ENTER FILE NAME TO BE SHIFTED: "
      READ (11,15) IFILE(1)
   15 FORMAT(S13)
      CALL STAT(IFILE,ST,IER)
      IF(IER.NE.1) GO TO 900
      IBLOCKS=ST(9)+1
      TYPE " BLOCK COUNT IS ",IBLOCKS
      CALL OPEN(5,IFILE,2,IER)
      IF(IER.NE.1) GO TO 900
      TYPE " FILE IS OPEN. "
      TYPE " ENTER OUTPUT FILENAME DESIRED: "
      READ (11,5) JFILE(1)
    5 FORMAT(S13)

C
C************************************************************
C
C     INPUT THE AMOUNT OF THE SHIFT AND THE DIRECTION
C     OF THE SHIFT.  1=UP AND 2=DOWN.
C
C************************************************************
C

      ACCEPT "<15> HOW MANY PLACES WILL THE SHIFT INVOLVE?<15>
     : SHIFT= ",ISHIFT
      IF(ISHIFT.LE.256) GO TO 20
      TYPE " EDIT THIS FILE.  YOUR SHIFT IS GREATER THAN 256."
      GO TO 910
C
   20 TYPE " SHIFT UP OR DOWN.  INPUT 1 FOR UP OR 2 FOR DOWN."
      ACCEPT " DIRECTION= ",MOVES
      IF(MOVES.EQ.2) GO TO 500
      IF(MOVES.EQ.1) GO TO 25
      TYPE " YOU MUST SELECT A 1 OR 2 ONLY!!!<15>"
      GO TO 20
C
C************************************************************
C
C     THIS IS THE START OF THE SHIFT UP ROUTINE.  IT
C     STARTS WITH THE LAST BLOCK IN IFILE AND SHIFTS
C     UP EACH BLOCK AN AMOUNT OF ISHIFT UNTIL IT HAS
C     SHIFTED UP THE ENTIRE FIRST BLOCK.  IT STORES
C     THE RESULTS IN THE FILE JFILE VIA THE JSTORE ARRAY
C
C************************************************************
C

   25 TYPE " YOU HAVE SELECTED A SHIFT OF "
      TYPE ISHIFT," PLACES IN THE UP DIRECTION."
      CALL OPEN(6,JFILE,2,IER)
      IF(IER.NE.1) GO TO 900
      SBLK=ST(9)
      CALL RDBLK(5,SBLK,ISTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      II=256-ISHIFT
      JJ=256
C
   30 JSTORE(JJ)=ISTORE(II)
      JJ=JJ-1
      II=II-1
      IF(II.GE.1) GO TO 30
C
   35 SBLK=SBLK-1
      CALL RDBLK(5,SBLK,ISTORE,1,IER)
      II=256
C
   40 JSTORE(JJ)=ISTORE(II)
      JJ=JJ-1
      II=II-1
      IF(II.LT.1) GO TO 50
      IF(JJ.GE.1) GO TO 40
C
      CALL WRBLK(6,SBLK+1,JSTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      JJ=256
      DO 45 I=1,256
      JSTORE(I)=0
   45 CONTINUE
      GO TO 40
C
   50 IF(SBLK.GT.0) GO TO 35
      CALL WRBLK(6,SBLK,JSTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      GO TO 915
C
C************************************************************
C
C     THIS IS THE START OF THE SHIFT DOWN ROUTINE.  IT
C     STARTS WITH THE FIRST BLOCK IN IFILE AND SHIFTS
C     DOWN EACH BLOCK AN AMOUNT ISHIFT UNTIL IT HAS
C     SHIFTED THE ENTIRE LAST BLOCK.  IT STORES THE
C     RESULTS IN THE FILE JFILE VIA THE JSTORE ARRAY.
C
C************************************************************
C

  500 TYPE " YOU HAVE SELECTED A SHIFT OF "
      TYPE ISHIFT," PLACES IN THE DOWN DIRECTION."
      CALL OPEN(6,JFILE,2,IER)
      IF(IER.NE.1) GO TO 900
      SBLK=0
      CALL RDBLK(5,SBLK,ISTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      II=ISHIFT
      JJ=1
C
  510 JSTORE(JJ)=ISTORE(II)
      JJ=JJ+1
      II=II+1
      IF(II.LE.256) GO TO 510
C
  520 SBLK=SBLK+1
      CALL RDBLK(5,SBLK,ISTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      II=1
C
  530 JSTORE(JJ)=ISTORE(II)
      JJ=JJ+1
      II=II+1
      IF(II.GT.256) GO TO 540
      IF(JJ.LE.256) GO TO 530
C
      CALL WRBLK(6,SBLK-1,JSTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      JJ=1
      DO 535 I=1,256
      JSTORE(I)=0
  535 CONTINUE
      GO TO 530
C
  540 IF(SBLK.LT.IBLOCKS-1) GO TO 520
      CALL WRBLK(6,SBLK,JSTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      GO TO 915
C
C************************************************************
C
C     THIS IS THE ERROR MESSAGE ROUTINE.  IT IS USED
C     WHEN THE IER VARIABLE IS RETURNED AS A NON-ONE
C     VALUE DURING ANY CALL COMMAND IN THE PROGRAM.
C
C************************************************************
C

  900 TYPE " <7><7>** FORTRAN IV SYSTEM ERROR **<7><7>"
      TYPE " ERROR CODE = ",IER
      TYPE " RETURNED ON A CALL COMMAND."
  910 TYPE " PROGRAM ABORTED <7><7>"

C
C************************************************************
C
C     END OF PROGRAM.  USES A CALL RESET COMMAND
C     TO RESET ALL CHANNELS OPENED.
C
C************************************************************
C
  915 TYPE " END OF PROGRAM. "
      CALL RESET
      END
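
The effect of SHIFT.FR on a voice file can be pictured with a flat list of samples. The Python sketch below is a simplified model, not a transcription of the block-by-block GO TO logic: it treats the whole file as one list, moves every sample by the requested number of places, and zero-fills the vacated end. The function name and the direction codes as Python arguments are assumptions made for illustration. In the listing, the down routine begins copying at element ISHIFT of the first block, so its exact offset may differ by one place from this nominal model.

```python
BLOCK = 256  # samples per disk block, as in the ISTORE/JSTORE arrays


def shift_samples(samples, places, direction):
    """Return a shifted copy of samples with zero fill.

    direction 1 ("up") moves samples toward higher indices;
    direction 2 ("down") moves them toward lower indices.
    """
    n = len(samples)
    if not 0 <= places <= min(BLOCK, n):
        raise ValueError("shift must be between 0 and 256 places")
    if direction == 1:
        # Zeros enter at the start; the last `places` samples fall off the end.
        return [0] * places + samples[:n - places]
    if direction == 2:
        # The first `places` samples are dropped; zeros enter at the end.
        return samples[places:] + [0] * places
    raise ValueError("direction must be 1 (up) or 2 (down)")


if __name__ == "__main__":
    print(shift_samples([1, 2, 3, 4], 1, 1))
    print(shift_samples([1, 2, 3, 4], 1, 2))
```

The output has the same length as the input, which matches the program writing one output block for each input block.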
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****»*««*****»#**#****#**#*#*#*#»*###*#* *»*###**##»#* 
C************************************************************
C
C     PROGRAM AUTOLPC.FR
C
C     FORTRAN IV LISTING
C
C     LTJG JEFFREY A. KAYSER, USCG
C
C************************************************************
C************************************************************
C

      INTEGER ISTORE(256),JSTORE(256),IFILE(13),JFILE(13),
     : AST(22),DST(22),ST,IBLOCKS,M,SBLK,IER,J,K
      REAL ALPHAI,ALPHAD,X(256)
      DIMENSION AI(260),AD(260),RCI(260),RCD(260)

C
C************************************************************
C
C     THIS SECTION READS IN THE NAMES OF THE FILES
C     TO BE ANALYZED USING THE LPC SUBROUTINE CALLED
C     AUTO.FR.  THIS IS A FORTRAN IV SUBROUTINE.
C
C************************************************************
C

      TYPE " ENTER ANALOG FILE TO BE TESTED: "
      READ(11,10) IFILE(1)
      TYPE " ENTER DISTORTED FILE TO BE TESTED: "
      READ(11,10) JFILE(1)
   10 FORMAT(S13)
      CALL STAT(IFILE,AST,IER)
      IF(IER.NE.1) GO TO 900
      CALL STAT(JFILE,DST,IER)
      IF(IER.NE.1) GO TO 900
      ST=DST(9)
      IF(AST(9).LT.DST(9)) ST=AST(9)
      IBLOCKS=ST+1
      TYPE " MAXIMUM BLOCK COUNT IS ",IBLOCKS
      CALL OPEN(1,IFILE,2,IER)
      IF(IER.NE.1) GO TO 900
      CALL OPEN(2,JFILE,2,IER)
      IF(IER.NE.1) GO TO 900
      CALL OPEN(3,"JAK78",2,IER)
      IF(IER.NE.1) GO TO 900
      ALSUMI=0.00
      ALSUMD=0.00
C
C************************************************************
C
C     THE FOLLOWING SECTION READS IN THE DESIRED FILTER
C     SIZE, NUMBER OF POINTS PER ANALYSIS WINDOW, AND
C     THE STARTING AND ENDING BLOCKS DESIRED.  NOTE THAT
C     THE FILTER SIZE MUST BE LESS THAN THE NUMBER OF
C     POINTS PER WINDOW AND A MAXIMUM OF 21.
C
C************************************************************
C

   20 TYPE " ENTER LPC FILTER SIZE (<21): "
      ACCEPT " SIZE= ",M
      IF(M.LT.21) GO TO 30
      TYPE " MUST BE LESS THAN 21 ! "
      GO TO 20
C
   30 ACCEPT " NUMBER OF POINTS= ",IVAL
      ACCEPT " START BLOCK=",SBLK
      ACCEPT " END BLOCK=",EBLK
      IF(EBLK.GT.ST) EBLK=ST
      LOOP=SBLK
C
C************************************************************
C
C     THE FOLLOWING SECTION READS EACH BLOCK OF THE
C     SELECTED SPEECH FILE AND CALLS THE SUBROUTINE
C     AUTO TO CALCULATE THE LPC COEFFICIENTS.
C
C************************************************************
C

   35 CALL RDBLK(1,SBLK,ISTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      N=IVAL
      J=0
C
   40 DO 50 I=1,IVAL
      L=I+J
      X(I)=((ISTORE(L)/2048.0)*5.0)
   50 CONTINUE
C
      CALL AUTO(N,X,M,AI,ALPHAI,RCI)
      ALSUMI=ALSUMI+ALPHAI
      REF=0.0
      DO 55 II=1,M
      REF=REF+RCI(II)
   55 CONTINUE
      J=J+IVAL
      IF(J.LT.255) GO TO 40
C
      SBLK=SBLK+1
      IF(SBLK.LE.EBLK) GO TO 35
      SBLK=LOOP
   75 CALL RDBLK(2,SBLK,JSTORE,1,IER)
      IF(IER.NE.1) GO TO 900
      J=0
   90 DO 100 I=1,IVAL
      L=I+J
      X(I)=((JSTORE(L)/2048.0)*5.0)
  100 CONTINUE
C
      CALL AUTO(N,X,M,AD,ALPHAD,RCD)
      ALSUMD=ALSUMD+ALPHAD
      REF=0.0
      DO 105 II=1,M
      REF=REF+RCD(II)
  105 CONTINUE
      J=J+IVAL
      IF(J.LT.256) GO TO 90
      SBLK=SBLK+1
      IF(SBLK.LE.EBLK) GO TO 75
      GO TO 915
C
C************************************************************
C
C     ERROR ROUTINE SECTION.  THIS SECTION IS USED
C     WHEN A NON-ONE VALUE IS RETURNED BY THE IER
C     VARIABLE ON ANY CALL COMMAND.
C
C************************************************************
C

  900 TYPE " <7><7> FORTRAN IV SYSTEM ERROR <7><7>"
      TYPE " ERROR =",IER
      TYPE " ERROR IS ON A CALL COMMAND"

C
C************************************************************
C
C     END OF PROGRAM.  THE RESULTS ARE PRINTED TO
C     THE SCREEN AS A RESULT CALLED SUMB.
C
C************************************************************
C

  915 TYPE " END OF PROGRAM"

      SUMB=(100.0-((ALSUMD-ALSUMI)/ALSUMD)*100.0)

      TYPE " SUMB=",SUMB
      TYPE " ALSUMI= ",ALSUMI," ALSUMD=",ALSUMD
      CALL RESET
      END
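
AUTOLPC.FR depends on the subroutine AUTO, whose source is not reproduced in this listing. For orientation only, the Python sketch below shows the conventional autocorrelation method (Levinson-Durbin recursion), which yields the same kinds of quantities the call to AUTO passes back: predictor coefficients, a residual energy, and reflection coefficients. It is an assumption that AUTO.FR works this way, and its sign conventions and scaling may differ. The score function assumes the final expression reads 100 - ((ALSUMD-ALSUMI)/ALSUMD)*100, which is one reading of a partly illegible printed line. All Python names are hypothetical.

```python
def autocorr_lpc(x, m):
    """Order-m LPC by the autocorrelation method (Levinson-Durbin).

    Returns (a, alpha, rc): a[0..m] with a[0] = 1 for the inverse filter
    A(z) = 1 + a[1]z^-1 + ... + a[m]z^-m, the residual energy alpha, and
    the m reflection coefficients rc.
    """
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(m + 1)]
    a = [1.0] + [0.0] * m
    rc = [0.0] * m
    alpha = r[0]
    for order in range(1, m + 1):
        if alpha <= 0.0:          # silent or perfectly predicted window
            break
        acc = r[order] + sum(a[j] * r[order - j] for j in range(1, order))
        k = -acc / alpha
        rc[order - 1] = k
        prev = a[:]
        for j in range(1, order):
            a[j] = prev[j] + k * prev[order - j]
        a[order] = k
        alpha *= 1.0 - k * k
    return a, alpha, rc


def summed_residual(blocks, window, m):
    """Add up alpha over consecutive windows of each block of samples."""
    total = 0.0
    for block in blocks:
        volts = [(s / 2048.0) * 5.0 for s in block]
        for start in range(0, len(volts), window):
            frame = volts[start:start + window]
            if len(frame) == window:   # this sketch skips a short last frame
                total += autocorr_lpc(frame, m)[1]
    return total


def percent_score(alsumi, alsumd):
    """Assumed reading of SUMB: 100 - ((ALSUMD - ALSUMI) / ALSUMD) * 100."""
    return 100.0 - ((alsumd - alsumi) / alsumd) * 100.0
```

Under this reading, identical residual sums for the analog and distorted files give a score of 100, and the score falls as the distorted file's summed residual grows relative to the analog file's.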
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